Abstract

We present constructions of Space-Time (ST) codes based on lattice coset coding. First, we focus
on ST code constructions for the short block-length case, i.e., when the block-length is equal to or
slightly larger than the number of transmit antennas. We present constructions based on dense lattice
packings and nested lattice (Voronoi) shaping. Our codes achieve the optimal diversity-multiplexing
tradeoff of quasi-static MIMO fading channels for any fading statistics, and perform very well also at 
practical, moderate values of signal to noise ratios (SNR). Then, we extend the construction to the case 
of large block lengths, by using trellis coset coding. We provide constructions of trellis coded modulation 
(TCM) schemes that are endowed with good packing and shaping properties. Both short-block and trellis 
constructions allow for a reduced complexity decoding algorithm based on minimum mean squared 
error generalized decision feedback equalizer (MMSE-GDFE) lattice decoding and a combination of this 
with a Viterbi TCM decoder for the TCM case. Beyond the interesting algebraic structure, we exhibit 
codes whose performance is among the state of the art for codes with similar encoding/decoding
complexity.

I. Introduction

The quasi-static, frequency-flat fading (complex) multiple-input multiple-output (MIMO) channel with 
M transmit and N receive antennas and coding block-length T channel uses is described by 

$$Y^c = H^c X^c + W^c, \qquad (1)$$

where $X^c$ denotes the $M \times T$ transmitted codeword matrix drawn from a space-time (ST) code $\mathcal{X}$, $Y^c$ is
the $N \times T$ received signal matrix, $H^c$ is the $N \times M$ channel matrix and $W^c$ is the $N \times T$ noise matrix.
The entries of the channel matrix $H^c$ are assumed to be constant over a block length of T channel uses
and the entries of $W^c$ are independent and identically distributed complex Gaussian with zero mean and
unit variance, i.e., i.i.d. $\mathcal{CN}(0, 1)$. The results of this paper hold for arbitrary channel fading statistics,
but we use the standard i.i.d. Rayleigh fading model for our simulations, in which case the entries
of $H^c$ are i.i.d. $\mathcal{CN}(0, 1)$. The input constraint

$$\mathbb{E}\,\|X^c\|_F^2 \le T\,\mathrm{SNR} \qquad (2)$$

is enforced, where $\mathbb{E}(\cdot)$ denotes the expectation operator and SNR takes on the meaning of the transmit
signal-to-noise ratio (total transmit energy per channel use over the noise power spectral density). The
channel matrix $H^c$ is assumed to be known perfectly at the receiver but not at the transmitter.
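As a numerical illustration of the channel model (1), the following sketch draws one channel realization and checks it against an equivalent real-valued model of the kind used later in (3). It is a minimal sketch under stated assumptions: the helper names (`real_equivalent`, `vec`, `cn`) are hypothetical, each complex column is mapped to its real part stacked on its imaginary part, and the power constraint (2), which is an average constraint, is imposed here exactly on a single random codeword purely for convenience.

```python
import numpy as np

def vec(A):
    """Stack the columns of a complex matrix, each column as [Re; Im]."""
    return np.concatenate(
        [np.concatenate([A[:, t].real, A[:, t].imag]) for t in range(A.shape[1])]
    )

def real_equivalent(Hc, T):
    """Real channel matrix I_T (kron) [[Re(Hc), -Im(Hc)], [Im(Hc), Re(Hc)]]."""
    block = np.block([[Hc.real, -Hc.imag], [Hc.imag, Hc.real]])
    return np.kron(np.eye(T), block)

rng = np.random.default_rng(0)
M, N, T, snr = 2, 2, 3, 10.0

def cn(*shape):
    """i.i.d. circularly symmetric complex Gaussian entries, unit variance."""
    return (rng.standard_normal(shape) + 1j * rng.standard_normal(shape)) / np.sqrt(2)

Hc, Wc, Xc = cn(N, M), cn(N, T), cn(M, T)
Xc *= np.sqrt(T * snr) / np.linalg.norm(Xc)   # enforce ||Xc||_F^2 = T * SNR

Yc = Hc @ Xc + Wc                  # complex model (1)
H = real_equivalent(Hc, T)
y = H @ vec(Xc) + vec(Wc)          # real-valued equivalent
```

Under this stacking convention `y` coincides with `vec(Yc)`, so any real lattice decoder can operate on `H` and `y` directly.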

The use of ST codes over MIMO channels is known to provide two kinds of benefits: better reliability 
through diversity gain, and higher data rates in terms of multiplexing gain. The diversity-multiplexing 
tradeoff (DMT) (see [9] for the definition and details) captures in a succinct and elegant way the tradeoff 
between these two quantities in the high signal to noise ratio (SNR) regime. The DMT specifies the 
maximum possible diversity that can be obtained at each possible value of multiplexing gain, and has 
become a standard performance metric to evaluate ST schemes, and a tool to compare different ST 
schemes. 

Families of codes that achieve the DMT of MIMO fading channels have been proposed. Perhaps the 
most notable in terms of performance and generality are Lattice ST (LaST) codes and codes obtained 
from cyclic division algebras (CDA). 

An ensemble of randomly generated LaST codes was shown to be DMT optimal under minimum mean
squared error generalized decision feedback equalizer (MMSE-GDFE) lattice decoding for $T \ge M+N-1$
[1]. In this case, DMT optimality is shown in a random coding sense (i.e., with respect to error probability
averaged over the random lattice ensemble) and for the Rayleigh i.i.d. fading statistics.

Families of carefully constructed CDA codes enjoy the so-called non-vanishing determinant (NVD)
property (to be defined subsequently), which in turn implies that these codes, under ML decoding, achieve
the optimal DMT in a universal sense, i.e., over any channel fading statistics [2]. Codes achieving the
optimal DMT over any fading statistics are called "approximately universal" in [3]. Furthermore, these
codes allow for minimum block length, i.e., there exist optimal codes for all $T \ge M$ [2].

In some sense, the present work may be thought of as a confluence of these two approaches. We
construct codes that retain desirable properties from both families: not only are they non-random,
explicit constructions from CDAs, but they also employ the nested lattice construction that enables shaping
gains and the reduced complexity MMSE-GDFE lattice decoding akin to the LaST codes.

The DMT captures the optimal performance for high SNR. Following [1], [2], attention has shifted 
towards constructing ST codes that not only achieve the DMT, but also perform well at finite (practical) 
values of SNR. For example, codes generated at random from the ensemble of [1] typically
perform within 1 to 3 dB of the outage probability, which can be regarded as an effective "quasi-lower
bound" on the performance of any code at meaningful SNR, i.e., for block error probability
not too large (say, $< 10^{-1}$). In this perspective, the first part of this work presents a construction of
structured LaST (S-LaST) codes$^1$ that achieve the DMT and perform well at finite SNR, for small to
moderate block-lengths (i.e., T is equal to or slightly larger than M). In the second part of the paper
we turn to the case of large block lengths $T \gg M$. This is motivated by the fact that in practical
wireless communication systems, information is encoded and sent over the channel in packets, together 
with training symbols, protocol information, and guard intervals. Therefore, packets cannot be too small, 
for otherwise the overhead would be a large part of the overall capacity. We target the case where data 
packets span a number of channel uses T considerably larger than the number of transmit antennas M, 
but nevertheless smaller than a fading coherence interval. Then, the fading channel is constant over the 
whole codeword of duration T channel uses. 

Unfortunately, the LaST and/or CDA constructions do not generalize, in practice, to $T \gg M$ since the
decoding complexity grows rapidly with T. Furthermore, with constructions such as those in [1], [2] it 
is not clear how to exploit the large block length to obtain codes with improved coding gain. Therefore, 
the challenge here is to design ST codes for large T that have good coding gain and low decoding 
complexity. In this regard, the authors in [21] have proposed a trellis coded modulation (TCM) scheme 
based on partitions of the Golden code [11]. For prior work on ST TCM, see [18], [19]. Building on 
these ideas, we propose a general technique for the construction of ST-TCM schemes with good coding 
and shaping gains. These codes can be decoded using the Viterbi algorithm where the branch metrics are
computed using a low complexity MMSE-GDFE lattice decoder. We show construction examples based
on the Gosset lattice $E_8$ and lattices drawn from the Golden+ algebra [12] that yield, to the best of the
authors' knowledge, the current state-of-the-art performance among codes with similar encoding/decoding
complexity.

$^1$ We use the term "structured" to distinguish these codes from the random lattice approach of [1].

In Section II we review LaST codes and ST codes from CDAs, as these form the two main ingredients
for our construction. We also review some concepts relating to lattice packings that will be used
subsequently. Code design for the short block-length case is presented in Section III, and Section IV deals
with the construction of TCM schemes. Simulation results are provided alongside each construction
and illustrate the effectiveness of the constructions.

II. Background 

A. Lattice Space-Time (LaST) codes 

An n-dimensional real lattice $\Lambda$ is a discrete additive subgroup of $\mathbb{R}^n$ defined as $\Lambda = \{Gu : u \in \mathbb{Z}^n\}$,
where G is the $n \times n$ (full-rank) real generator matrix of $\Lambda$. The fundamental Voronoi cell of $\Lambda$, denoted
as $\mathcal{V}(\Lambda)$, is the set of points $x \in \mathbb{R}^n$ closer to zero than to any other point $\lambda \in \Lambda$. The fundamental
volume of $\Lambda$ is

$$V_f(\Lambda) = V(\mathcal{V}(\Lambda)) = \int_{\mathcal{V}(\Lambda)} dx = \sqrt{\det(G^T G)}.$$

An n-dimensional lattice code $\mathcal{C}(\Lambda, u_0, \mathcal{R})$ is the finite subset of the lattice translate $\Lambda + u_0$ inside the
shaping region $\mathcal{R}$, i.e., $\mathcal{C} = \{\Lambda + u_0\} \cap \mathcal{R}$, where $\mathcal{R}$ is a bounded measurable region of $\mathbb{R}^n$.

LaST codes are more easily illustrated by considering the real vectorized channel model equivalent to (1),

$$y = Hx + w, \qquad (3)$$

where $x \in \mathbb{R}^{2MT}$ and $y, w \in \mathbb{R}^{2NT}$ denote respectively the vector equivalents of $X^c$, $Y^c$ and $W^c$ obtained
by separating real and imaginary parts and by stacking columns, and where

$$H = I_T \otimes \begin{bmatrix} \mathrm{Re}(H^c) & -\mathrm{Im}(H^c) \\ \mathrm{Im}(H^c) & \mathrm{Re}(H^c) \end{bmatrix}$$

according to the well-known construction as in [1]. We say that an $M \times T$ space-time coding scheme $\mathcal{X}$ is a
full-dimensional LaST code if its vectorized (real) codebook (corresponding to the channel model in (3))
is a lattice code $\mathcal{C}(\Lambda, u_0, \mathcal{R})$, for some n-dimensional lattice $\Lambda$, translation vector $u_0$, and shaping region
$\mathcal{R}$, where n = 2MT. Given the equivalence of the real vector and the complex matrix representation
of $\mathcal{X}$, we shall not distinguish between them explicitly and write simply $\mathcal{X} = \mathcal{C}(\Lambda, u_0, \mathcal{R})$. Any
linear-dispersion ST code, including the constructions of [2], can be represented as a LaST code, for a suitable
shaping region. For later use, we define the lattice quantization function as

$$Q_\Lambda(y) = \arg\min_{\lambda \in \Lambda} |y - \lambda|$$

and the modulo-lattice function

$$[y] \bmod \Lambda = y - Q_\Lambda(y).$$
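The two functions just defined can be sketched in a few lines. This is a minimal illustration, not the closest-point search used in the paper: it rounds the real coefficients of y and then searches a small box of integer offsets around that estimate, which is exact for the two-dimensional hexagonal basis assumed below but is not a general-purpose sphere decoder. The names `quantize` and `mod_lattice` are hypothetical.

```python
import itertools
import numpy as np

def quantize(G, y, search=1):
    """Closest point of {G u : u integer} to y, by brute force around coefficient rounding."""
    u0 = np.rint(np.linalg.solve(G, y))
    best, best_dist = None, np.inf
    for offset in itertools.product(range(-search, search + 1), repeat=G.shape[1]):
        point = G @ (u0 + np.array(offset))
        dist = np.linalg.norm(y - point)
        if dist < best_dist:
            best, best_dist = point, dist
    return best

def mod_lattice(G, y, search=1):
    """[y] mod Lambda = y - Q_Lambda(y)."""
    return y - quantize(G, y, search)

# Hexagonal lattice with minimum distance 1 (columns are basis vectors).
G_hex = np.array([[1.0, 0.5], [0.0, np.sqrt(3) / 2]])
y = np.array([2.3, -1.1])
r = mod_lattice(G_hex, y)
# Shifting y by a lattice point leaves the modulo-lattice result unchanged.
r_shifted = mod_lattice(G_hex, y + G_hex @ np.array([3, -2]))
```

The residual `r` lies in the fundamental Voronoi cell, so its norm cannot exceed the covering radius of the lattice, which is $1/\sqrt{3}$ for this hexagonal basis.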



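We also define the notion of a non-vanishing determinant (NVD) for an infinite LaST code (i.e.,
disregarding the shaping region $\mathcal{R}$) as follows. A LaST code has the NVD property if and only if the minimum
determinant corresponding to its infinite lattice $\Lambda$ is bounded away from zero by a constant independent
of SNR, i.e.,$^2$

$$\min_{\substack{\Delta X^c = X_i^c - X_j^c \\ X_i \ne X_j,\ X_i, X_j \in \Lambda + u_0}} \det\left[\Delta X^c (\Delta X^c)^H\right] \ \dot{\ge}\ \mathrm{SNR}^0.$$

Notice that since $\Lambda$ is a lattice, the set of codeword differences is $\Lambda$ itself, so this is equivalent to

$$\min_{x \in \Lambda,\ x \ne 0} \det\left[X^c (X^c)^H\right] \ \dot{\ge}\ \mathrm{SNR}^0,$$

where $X^c$ is the complex matrix corresponding to the lattice point x.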



B. ST Codes from CDA 

For a detailed exposition of ST codes from CDA, we refer the reader to [24], [2] and references 
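therein. We provide a very brief review in the sequel. Let $\mathbb{Q}$ denote the field of rational numbers and
$i = \sqrt{-1}$. Set $\mathbb{F} = \mathbb{Q}(i)$. The construction of a CDA calls for the construction of an n-degree cyclic
Galois extension $\mathbb{L}/\mathbb{F}$ with generator $\sigma$. Then a CDA $D(\mathbb{L}/\mathbb{F}, \sigma, \gamma)$ with center $\mathbb{F}$, maximal subfield $\mathbb{L}$
and index n is the set of all elements of the form $\sum_{i=0}^{n-1} z^i \ell_i$, where z is an indeterminate satisfying
$\ell z = z\sigma(\ell)\ \forall\, \ell \in \mathbb{L}$ and $z^n = \gamma$. The element $\gamma$ needs to be a properly chosen non-norm element in
order to ensure that D is a division algebra, see [24], [2] for details. Every element in the CDA can be
associated with an $n \times n$ matrix through the left regular representation, which is of the form

$$\begin{bmatrix}
\ell_0 & \gamma\sigma(\ell_{n-1}) & \gamma\sigma^2(\ell_{n-2}) & \cdots & \gamma\sigma^{n-1}(\ell_1) \\
\ell_1 & \sigma(\ell_0) & \gamma\sigma^2(\ell_{n-1}) & \cdots & \gamma\sigma^{n-1}(\ell_2) \\
\vdots & \vdots & \vdots & \ddots & \vdots \\
\ell_{n-1} & \sigma(\ell_{n-2}) & \sigma^2(\ell_{n-3}) & \cdots & \sigma^{n-1}(\ell_0)
\end{bmatrix} \qquad (4)$$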



$^2$ We make use of the exponential equality notation from [9], defined as

$$a \doteq \rho^{-b} \iff b = -\lim_{\rho \to \infty} \frac{\log a}{\log \rho}.$$

The notations $\dot{\ge}$ and $\dot{\le}$ are defined similarly.
where $\ell_i \in \mathbb{L}$. The trace and determinant of the above matrix are respectively defined to be the reduced
trace $\mathrm{tr}_r(\cdot)$ and reduced norm $N_r(\cdot)$ of the element it represents. The ST code with M = T = n is a
finite collection of matrices of the above form, scaled to satisfy the power constraint in (2). Choosing
$\gamma \in \mathbb{Z}[i]$ and restricting the $\ell_i$ to belong to the ring of integers $\mathcal{O}_L$ of $\mathbb{L}$ bestows the NVD property on
the ST code. One such choice for the $\ell_i$ corresponds to choosing

$$\ell_i = \sum_{k=1}^{n} e_{i,k}\,\beta_k, \qquad e_{i,k} \in \mathcal{A}_{\mathrm{QAM}}, \qquad (5)$$

with $\mathcal{A}_{\mathrm{QAM}} = \{a + ib \mid -Q + 1 \le a, b \le Q - 1,\ a, b \text{ odd}\}$, and where $\beta_k$, $k = 1, 2, \ldots, n$ is an integral
basis (i.e., a basis as a module) for $\mathcal{O}_L/\mathcal{O}_F$. More generally, we could choose $\{\beta_k\}_{k=1}^{n}$ to constitute an
$\mathcal{O}_F$-basis for any ideal $\mathcal{I} \subset \mathcal{O}_L$. In this case, $|\mathcal{X}| = Q^{2n^2}$. The results of [2], [3] show that codes derived
from CDA with NVD are approximately universal.
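The 2 x 2 instance of the representation (4) and the NVD property can be checked numerically. The sketch below is illustrative only and rests on assumptions not taken from this section: it uses the algebra underlying the Golden code [11], namely $\mathbb{L} = \mathbb{Q}(i, \sqrt{5})$ with $\sigma(\sqrt{5}) = -\sqrt{5}$ and $\gamma = i$, the integral basis $\{1, \theta\}$ with $\theta = (1+\sqrt{5})/2$, and small Gaussian-integer coefficients rather than the odd-integer QAM alphabet of (5). Function names are hypothetical.

```python
import itertools

THETA = (1 + 5 ** 0.5) / 2       # theta, a root of x^2 - x - 1
THETA_BAR = (1 - 5 ** 0.5) / 2   # sigma(theta)
GAMMA = 1j                       # non-norm element gamma = i

def embed(a, b):
    """Return (l, sigma(l)) for l = a + b*theta, with a, b Gaussian integers."""
    return a + b * THETA, a + b * THETA_BAR

def regular_rep(l0, l1):
    """2 x 2 instance of (4): [[l0, gamma*sigma(l1)], [l1, sigma(l0)]]."""
    return [[l0[0], GAMMA * l1[1]],
            [l1[0], l0[1]]]

def det2(X):
    return X[0][0] * X[1][1] - X[0][1] * X[1][0]

# Enumerate all coefficient tuples with real and imaginary parts in {-1, 0, 1}.
gauss = [complex(re, im) for re in (-1, 0, 1) for im in (-1, 0, 1)]
min_abs_det = float("inf")
for a, b, c, d in itertools.product(gauss, repeat=4):
    if (a, b, c, d) == (0, 0, 0, 0):
        continue
    X = regular_rep(embed(a, b), embed(c, d))
    min_abs_det = min(min_abs_det, abs(det2(X)))
```

Each determinant equals $N(\ell_0) - \gamma N(\ell_1)$, a nonzero Gaussian integer, so its magnitude never falls below 1 no matter how large the coefficient range is made; that is the non-vanishing determinant property for this (unscaled) infinite code.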

In the recent work [12], ST codes are obtained from maximal orders in CDAs. For the sake of later 
use, a brief review follows. A $\mathbb{Z}[i]$-order in an $\mathbb{F}$-algebra D is a subring $\mathcal{O}$ of D, having the same
identity element as D, and such that $\mathcal{O}$ is a finitely generated module over $\mathbb{Z}[i]$ and generates D as a
linear space over $\mathbb{F}$.

An order $\mathcal{O}$ is called maximal if it is not properly contained in any other $\mathbb{Z}[i]$-order. The discriminant
of a $\mathbb{Z}[i]$-order $\mathcal{O}$ is computed as $d(\mathcal{O}/R) = \det\left([\mathrm{tr}_r(b_i b_j)]_{i,j=1}^{m}\right)$, where $R = \mathbb{Z}[i]$ and $\{b_1, \ldots, b_m\}$ is any $\mathbb{Z}[i]$-basis
of $\mathcal{O}$.

All maximal orders of a CDA share the same value of the discriminant, and also have the smallest 
possible discriminant among all orders within a given CDA. An important property of elements of an order 
of a CDA $D(\mathbb{L}/\mathbb{F}, \sigma, \gamma)$ is that their reduced norm (i.e., the determinant of their matrix representation) is
an element of the ring of integers $\mathcal{O}_F = \mathbb{Z}[i]$ of the center $\mathbb{F}$. This property ensures that ST codes carved
out of orders in suitably constructed CDAs are endowed with the NVD property. The choice of a subset
of elements of D corresponding to (5) amounts to choosing a particular order $\mathcal{O}$ known as the natural
order. 

It is established in [12] that the discriminant of an order in a CDA is directly proportional to the 
fundamental volume of the ensuing lattice (they are in fact equal for the case when the center of the 
CDA is $\mathbb{F} = \mathbb{Q}(i)$). Therefore, in order to maximize the energy efficiency of the code, a sensible design
guideline is to use the maximal order of the CDA to derive ST codes, owing to them having the minimum 
possible discriminant. All previous constructions of ST codes from CDAs, including the ones in [24], 
[2], [4], [11], [5] have used the natural order, which is not guaranteed to be maximal in general. 

As an illustration of the technique, the authors in [12] construct a 2 x 2 ST code derived from the
maximal order of a CDA named the Golden+ Algebra ($\mathcal{GA}+$), whose minimum determinant improves
upon that of previously known constructions. We will revisit this construction subsequently in Section III
and use it to construct some of our examples.

C. Lattice Packings 

The classical sphere packing problem is to find how densely a large number of identical spheres can 
be packed together in n-dimensional space. A packing is called a lattice packing if it has the property 
that the set of centres of the spheres forms a lattice in n-dimensional space. An excellent reference for
this area is the book by Conway and Sloane [6].

The density $\Delta$ of a lattice packing is given by

$$\Delta = \text{proportion of space that is occupied by the spheres} = \frac{\text{volume of one sphere}}{V_f(\Lambda)}.$$

A related quantity is the center density $\delta$, given by

$$\delta = \frac{\Delta}{V_n},$$

where $V_n$ is the volume of an n-dimensional sphere of radius 1, given by

$$V_n = \frac{\pi^{n/2}}{(n/2)!} = \frac{2^n \pi^{(n-1)/2}\,((n-1)/2)!}{n!}$$

(the second form avoids the use of $(n/2)!$ when n is odd). A related parameter is the fundamental coding
gain $\gamma_c(\Lambda)$, defined as

$$\gamma_c(\Lambda) \triangleq 4\delta^{2/n} = \frac{d_{\min}^2(\Lambda)}{V_f(\Lambda)^{2/n}}, \qquad (6)$$

where $d_{\min}(\Lambda)$ denotes the minimum distance of the lattice $\Lambda$. It is evident from the definition that the
fundamental coding gain is a normalized measure of the density of the lattice. Further, the fundamental
coding gain is dimensionless and invariant to scaling and to
any orthogonal transformation (rotation) [8]. For the cubic lattice, $\gamma_c(\mathbb{Z}^n) = 1$.
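The definition (6) is easy to evaluate for small lattices. The sketch below computes $d_{\min}^2(\Lambda)/V_f(\Lambda)^{2/n}$ from a generator matrix; the function name `coding_gain` is hypothetical, and the minimum distance is found by brute-force enumeration of integer coefficients in {-1, 0, 1}, which is an assumption that happens to suffice for the standard bases used here but is not a general shortest-vector algorithm. The generator matrices are the usual textbook ones (written row-wise and transposed so that columns are basis vectors, matching $\Lambda = \{Gu\}$).

```python
import itertools
import numpy as np

def coding_gain(G, coeff_range=1):
    """gamma_c = d_min^2 / V_f^(2/n); columns of G are the basis vectors."""
    n = G.shape[1]
    U = np.array(list(itertools.product(range(-coeff_range, coeff_range + 1), repeat=n)))
    U = U[np.any(U != 0, axis=1)]                      # drop the zero vector
    d_min_sq = np.min(np.sum((U @ G.T) ** 2, axis=1))  # lattice points are G u
    volume = np.sqrt(np.linalg.det(G.T @ G))
    return d_min_sq / volume ** (2.0 / n)

Z4 = np.eye(4)
HEX = np.array([[1.0, 0.0], [0.5, np.sqrt(3) / 2]]).T
D4 = np.array([[1, 1, 0, 0], [1, -1, 0, 0], [0, 1, -1, 0], [0, 0, 1, -1]], dtype=float).T
E8 = np.array([
    [2, 0, 0, 0, 0, 0, 0, 0],
    [-1, 1, 0, 0, 0, 0, 0, 0],
    [0, -1, 1, 0, 0, 0, 0, 0],
    [0, 0, -1, 1, 0, 0, 0, 0],
    [0, 0, 0, -1, 1, 0, 0, 0],
    [0, 0, 0, 0, -1, 1, 0, 0],
    [0, 0, 0, 0, 0, -1, 1, 0],
    [0.5, 0.5, 0.5, 0.5, 0.5, 0.5, 0.5, 0.5],
]).T

gains = {name: coding_gain(G) for name, G in
         [("Z4", Z4), ("hex", HEX), ("D4", D4), ("E8", E8)]}
```

For these bases the values are 1 for the cubic lattice, $2/\sqrt{3}$ for the hexagonal lattice, $\sqrt{2}$ for $D_4$ and 2 for $E_8$; scaling or rotating any of the generators leaves the result unchanged.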

The problem of finding dense packings (i.e., those with high values of $\gamma_c(\Lambda)$) in n-dimensional space
has a long and interesting history. In two dimensions, Gauss proved that the hexagonal lattice is the
densest plane lattice packing, and in 1940, L. Fejes Tóth proved that the hexagonal lattice is indeed the
densest of all possible plane packings. In 1611, the German astronomer Johannes Kepler stated that no
packing in three dimensions can be denser than that of the face-centered cubic (f.c.c.) lattice arrangement,
which fills about 0.7405 of the available space. It took mathematicians some 400 years to prove him
right, with Thomas Hales proving the conjecture in 1998 (Gauss showed in 1831 that the f.c.c. lattice is
the densest possible lattice packing in three dimensions). The densest possible lattice packings are known
for all dimensions $n \le 8$. The checkerboard lattices $D_4$ and $D_5$ are the densest possible lattice packings
in 4 and 5 dimensions respectively, while Gosset's root lattices $E_6$, $E_7$ and $E_8$ are optimal among lattice
packings in 6, 7 and 8 dimensions. It is also known that the densest lattice packings in dimensions 1 to
8 are unique. Although not proven, it seems likely that the Coxeter-Todd lattice $K_{12}$, the Barnes-Wall lattice
$\Lambda_{16} = BW_{16}$ and the Leech lattice $\Lambda_{24}$ are the densest lattices in dimensions 12, 16 and 24 respectively
[6]. Tables of the best known lattice packings in n dimensions are available in the literature [6] and in
the online catalogue of lattices [7].
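The density figures quoted above can be reproduced from the definitions of $\Delta$ and $V_n$. The following is a minimal sketch: the function names are hypothetical, the minimum distances are supplied by hand rather than computed, and the f.c.c. and hexagonal bases are standard choices assumed here for illustration.

```python
import math
import numpy as np

def unit_ball_volume(n):
    """V_n = pi^(n/2) / (n/2)!, with the factorial taken as Gamma(n/2 + 1)."""
    return math.pi ** (n / 2) / math.gamma(n / 2 + 1)

def unit_ball_volume_odd(n):
    """Second form of V_n for odd n: 2^n pi^((n-1)/2) ((n-1)/2)! / n!."""
    k = (n - 1) // 2
    return 2 ** n * math.pi ** k * math.factorial(k) / math.factorial(n)

def packing_density(G, d_min):
    """Delta = (volume of a sphere of radius d_min / 2) / V_f(Lambda)."""
    n = G.shape[1]
    fundamental_volume = math.sqrt(np.linalg.det(G.T @ G))
    return unit_ball_volume(n) * (d_min / 2) ** n / fundamental_volume

def center_density(G, d_min):
    """delta = Delta / V_n."""
    return packing_density(G, d_min) / unit_ball_volume(G.shape[1])

FCC = np.array([[1, 1, 0], [1, 0, 1], [0, 1, 1]], dtype=float).T   # d_min = sqrt(2)
HEX = np.array([[1.0, 0.0], [0.5, math.sqrt(3) / 2]]).T           # d_min = 1

density_fcc = packing_density(FCC, math.sqrt(2))
density_hex = packing_density(HEX, 1.0)
```

With these inputs the f.c.c. density evaluates to $\pi/(3\sqrt{2}) \approx 0.7405$, the figure attributed to Kepler's arrangement, and the hexagonal density to $\pi/(2\sqrt{3}) \approx 0.9069$.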

For later use, we define a lattice $\Lambda$ with generator matrix G to be an integral lattice if the Gram matrix
$A = G^T G$ has integer entries. It turns out that many of the best known lattices in terms of packing
belong to this class, when suitably scaled.

III. The Structured LaST Code Construction 

This section deals with code design for the case of short block-lengths, i.e., T is equal to or slightly 
larger than M. Before we present the construction, we first explore the LaST formulation of space-time 
codes derived from CDA. 

A. CDA ST Codes as Lattice Codes 

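We will illustrate the equivalent lattice structure with an example of a 2 x 2 ST code derived from
CDA. From (4), any codeword matrix is of the form

$$X^c = \begin{bmatrix} \ell_0 & \gamma\sigma(\ell_1) \\ \ell_1 & \sigma(\ell_0) \end{bmatrix}.$$

The real vector corresponding to $X^c$ in the equivalent channel model of (3) is given by

$$x = \left[\mathrm{Re}(x^c)^T \ \ \mathrm{Im}(x^c)^T\right]^T,$$

where

$$x^c = \left[\ell_0 \ \ \ell_1 \ \ \gamma\sigma(\ell_1) \ \ \sigma(\ell_0)\right]^T \in \mathbb{C}^4.$$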
Let $\{\beta_1, \beta_2\}$ denote an integral basis over $\mathbb{Z}[i]$ for some ideal $\mathcal{I} \subset \mathcal{O}_L$. Then, in accordance with (5), $x^c$
represents a point in the (complex) lattice whose generator matrix is given by

$$G^c = \begin{bmatrix}
\beta_1 & \beta_2 & 0 & 0 \\
0 & 0 & \beta_1 & \beta_2 \\
0 & 0 & \gamma\sigma(\beta_1) & \gamma\sigma(\beta_2) \\
\sigma(\beta_1) & \sigma(\beta_2) & 0 & 0
\end{bmatrix}, \qquad (7)$$

i.e.,

$$x^c = G^c\,[a_1\ a_2\ a_3\ a_4]^T, \qquad \{a_j\}_{j=1}^{4} \in \mathbb{Z}[i].$$

The corresponding real lattice generator matrix is given by

$$G = \begin{bmatrix} \mathrm{Re}(G^c) & -\mathrm{Im}(G^c) \\ \mathrm{Im}(G^c) & \mathrm{Re}(G^c) \end{bmatrix}.$$

It is now evident that the choice of parameters $\gamma$ and $\{\beta_1, \beta_2\}$ completely determines the lattice structure
of the ST code (assuming a particular generator $\sigma$ for the group of automorphisms). Furthermore, the
choice of these parameters in conjunction with (5) amounts to the choice of a particular subset $\mathcal{L}$ of $\mathcal{O}_L$
to be the signaling alphabet. The key to ensuring good constellation shaping lies in an intelligent choice
of the non-norm element and the integral basis.
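The passage from the codeword matrix to the complex generator matrix (7) and then to the real generator matrix can be verified numerically. The sketch below is an illustration under assumptions not fixed by the text: it takes $\gamma = i$, the basis $\{\beta_1, \beta_2\} = \{1, \theta\}$ with $\theta = (1+\sqrt{5})/2$ and $\sigma(\theta) = (1-\sqrt{5})/2$, and an arbitrary coefficient vector; the variable names are hypothetical.

```python
import numpy as np

theta, theta_bar, gamma = (1 + 5 ** 0.5) / 2, (1 - 5 ** 0.5) / 2, 1j
beta = np.array([1.0, theta])            # (beta_1, beta_2)
sigma_beta = np.array([1.0, theta_bar])  # (sigma(beta_1), sigma(beta_2))

# Complex generator matrix laid out as in (7).
Gc = np.zeros((4, 4), dtype=complex)
Gc[0, :2] = beta
Gc[1, 2:] = beta
Gc[2, 2:] = gamma * sigma_beta
Gc[3, :2] = sigma_beta

# Real generator matrix [[Re, -Im], [Im, Re]].
G = np.block([[Gc.real, -Gc.imag], [Gc.imag, Gc.real]])

a = np.array([1 + 2j, -1j, 3, 2 - 1j])   # coefficients in Z[i]
xc = Gc @ a

# Direct construction of the codeword matrix from l0, l1 and their conjugates.
l0, l1 = a[0] + a[1] * theta, a[2] + a[3] * theta
s0, s1 = a[0] + a[1] * theta_bar, a[2] + a[3] * theta_bar
Xc = np.array([[l0, gamma * s1], [l1, s0]])

x = G @ np.concatenate([a.real, a.imag])
```

Stacking the columns of `Xc` reproduces `xc`, and `x` equals the real and imaginary parts of `xc` stacked on top of each other, so the real lattice generated by `G` is exactly the vectorized codebook.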

In [4], these parameters are chosen to ensure that the resultant lattice generated by G is a rotated
version of the cubic lattice $\mathbb{Z}^{2MT}$, i.e., that G is a unitary matrix. The cubic shaping is in fact the best
possible shaping that we can obtain by a linear encoder over the reals (linear-dispersion code). No shaping
gain can be achieved by a linear map: at most, the encoder does not increase the transmit energy. This is
indeed obtained by G unitary, that is, an isometry of $\mathbb{R}^{2MT}$. The authors in [4] provide such constructions
for 2 x 2, 3 x 3, 4 x 4 and 6 x 6 (square) ST codes with NVD and have termed the resultant ST codes
perfect codes. More recently, [5] presented perfect ST code constructions for an arbitrary number of transmit
antennas and also for the rectangular case ($T > M$).
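The unitary-generator property can be checked for the best known perfect code, the 2 x 2 Golden code [11]. The sketch below relies on the standard published form of that code, which is not reproduced in this section and is therefore an assumption here: $X = \frac{1}{\sqrt{5}}\begin{bmatrix} \alpha(a + b\theta) & \alpha(c + d\theta) \\ i\bar{\alpha}(c + d\bar{\theta}) & \bar{\alpha}(a + b\bar{\theta}) \end{bmatrix}$ with $\alpha = 1 + i - i\theta$ and $\bar{\alpha} = 1 + i - i\bar{\theta}$. The entry layout differs from (4) by a transposition, which does not affect unitarity.

```python
import numpy as np

theta = (1 + 5 ** 0.5) / 2
theta_bar = 1 - theta
alpha = 1 + 1j - 1j * theta
alpha_bar = 1 + 1j - 1j * theta_bar

# Rows: stacked entries (X11, X21, X12, X22); columns: coefficients (a, b, c, d).
Gc = np.array([
    [alpha, alpha * theta, 0, 0],
    [0, 0, 1j * alpha_bar, 1j * alpha_bar * theta_bar],
    [0, 0, alpha, alpha * theta],
    [alpha_bar, alpha_bar * theta_bar, 0, 0],
]) / np.sqrt(5)

# Real 8 x 8 generator matrix.
G = np.block([[Gc.real, -Gc.imag], [Gc.imag, Gc.real]])

gram_complex = Gc.conj().T @ Gc
gram_real = G.T @ G
```

Both Gram matrices come out as identities, so the Golden code lattice is a rotated copy of $\mathbb{Z}^8$: linear encoding with it neither increases nor decreases the transmit energy relative to the uncoded QAM symbols.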

B. The S-LaST Construction 

We wish to obtain LaST codes with the following properties: 

1) the NVD property; 

2) the underlying lattice $\Lambda_c$ (referred to as the coding lattice in the following) has large fundamental
coding gain $\gamma_c(\Lambda_c)$ (see (6));

3) the shaping region $\mathcal{R}$ is as close as possible to a sphere.
We term the resulting codes Structured-LaST (S-LaST) codes. The third property yields good shaping
gain $\gamma_s$, defined as the ratio of the normalized second moment of an n-dimensional hypercube to that
of the shaping region $\mathcal{R}$. If the shaping region is an n-dimensional hypercube, as in the case of perfect
codes, then $\gamma_s = 1$. Choosing a better shaping region $\mathcal{R}$ does not change the geometric arrangement of
the lattice points, but the average transmitted energy is decreased thanks to shaping. The above three
requirements are simultaneously achieved using a nested lattice (Voronoi) construction and a non-linear
modulo-lattice encoder nicknamed the sphere encoder.$^3$

Let $G_p$ denote the generator matrix of a perfect code (unitary), and let $G_\Lambda$ denote the generator matrix
of a good 2MT-dimensional integral lattice $\Lambda$, that is, a lattice with large fundamental coding gain (such
lattices are available in the literature [6]). Define $\Lambda_c$ to be the lattice with generator matrix $G_{\Lambda_c} = G_p G_\Lambda$
and let $\Lambda_s$ (referred to as the shaping lattice) be a sublattice of $\Lambda_c$ such that $\Lambda_s$ has good shaping gain.
Let $[\Lambda_c|\Lambda_s]$ denote the nesting ratio, that is, the cardinality of the quotient group $\Lambda_c/\Lambda_s$.

Then, we construct a structured LaST code $\mathcal{X}$ as the set of all distinct points x given by

$$x = [\lambda + u_0] \bmod \Lambda_s$$

as $\lambda$ varies in $\Lambda_c$, where $u_0$ is a translation vector used to symmetrize the code.

Although not necessary, in all cases considered in this paper we let $\Lambda_s = Q\Lambda_c$, $Q \in \mathbb{Z}_+$ for simplicity,
i.e., we use a self-similar shaping lattice. The rationale behind this choice is that it is well-known that
for moderate dimensions, the best lattices with respect to coding gain are also good quantizers, i.e., have
good shaping gain. The coding rate is given by $R = \frac{1}{T}\log[\Lambda_c|\Lambda_s] = 2M \log Q$. Notice also that because
of the "rotation" matrix $G_p$ and the fact that $\Lambda$ is an integral lattice, the set of points $\mathcal{X}$ represented as
complex matrices has the NVD property.
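The rate expression follows from a volume count: a self-similar sublattice $Q\Lambda_c$ of a 2MT-dimensional lattice has index $Q^{2MT}$. The sketch below checks this for M = T = 2 and Q = 16; the helper names are hypothetical, logarithms are taken base 2 so that the rate is in bits per channel use, and the upper-triangular generator is only a stand-in for an actual coding lattice, since the nesting ratio of $Q\Lambda_c$ in $\Lambda_c$ does not depend on which lattice is used.

```python
import numpy as np

def nesting_ratio(G_c, G_s):
    """|Lambda_c / Lambda_s| = V_f(Lambda_s) / V_f(Lambda_c) for a sublattice Lambda_s."""
    return abs(np.linalg.det(G_s)) / abs(np.linalg.det(G_c))

def rate_bpcu(G_c, G_s, T):
    """R = (1/T) log2 [Lambda_c | Lambda_s], in bits per channel use."""
    return np.log2(nesting_ratio(G_c, G_s)) / T

M, T, Q = 2, 2, 16
n = 2 * M * T
G_c = np.triu(np.ones((n, n)))   # stand-in full-rank generator (assumption)
G_s = Q * G_c                    # self-similar shaping lattice Q * Lambda_c
R = rate_bpcu(G_c, G_s, T)
```

This gives R = 16 bpcu, matching $2M \log_2 Q$ and the rate used in most of the simulations reported below; Q = 2 gives 4 bpcu.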

$^3$ Tree-search algorithms to perform the Closest Lattice Point Search (CLPS), based on Pohst enumeration [26] and generalized
in [22], [23], are generally nicknamed "sphere decoders" if used for minimum distance lattice decoding or "sphere encoders"
if used for modulo-lattice precoding, in the current communication and coding theoretic literature. The nickname
stems from the bounded-distance enumerative decoding of the Pohst lattice point enumeration and variants thereof.

Fig. 1. Illustrating the Sphere-Encoder: Hexagonal Lattice, Q = 16, linear map (left) and sphere-encoded map (right)

Theorem 1: The space-time code $\mathcal{X}$ derived from the lattice $G_{\Lambda_c} = G_p G_\Lambda$ using a nested-lattice
structure corresponds to a space-time code derived from CDA with non-vanishing determinant and hence
achieves the optimal DMT over any fading channel statistics.

Proof: Recall that $G_p$ corresponds to a ST code with NVD, i.e., the set of all non-zero lattice
vectors $z \in G_p\mathbb{Z}^{2MT}$, represented as complex matrices $Z^c$, have $\det\left[Z^c (Z^c)^H\right]$ bounded away from zero
by some constant term $\mathrm{SNR}^0$ (up to order of exponent of SNR). Since $\Lambda$ is an integral lattice, there exists
a constant k such that $kG_\Lambda$ generates a sublattice of $\mathbb{Z}^{2MT}$. It follows that the LaST code $k\mathcal{X}$ generated by
$kG_pG_\Lambda$ is a sublattice of $G_p\mathbb{Z}^{2MT}$ and therefore satisfies

$$\min_{X \in \mathcal{X}:\ X \ne 0} \det(XX^H) \ \ge\ k^{-2M}\,\mathrm{SNR}^0 \doteq \mathrm{SNR}^0.$$

The proof of DMT optimality now follows from [2], [3]. ■
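The scaling step in the proof can be spelled out as follows. This is a sketch of the reasoning in the notation used above, where the constant $c > 0$ (a lower bound on the determinants of the underlying perfect code lattice $G_p\mathbb{Z}^{2MT}$) is a symbol introduced here for illustration. For any non-zero $X \in \mathcal{X}$ the scaled matrix $kX$ is a non-zero point of $G_p\mathbb{Z}^{2MT}$, and since $X$ is $M \times T$,

```latex
\det\left(XX^H\right)
  = \det\left(k^{-2}\,(kX)(kX)^H\right)
  = k^{-2M}\,\det\left((kX)(kX)^H\right)
  \ \ge\ k^{-2M}\,c .
```

Because neither $k$ nor $c$ depends on SNR, the right-hand side is a constant, i.e., it is $\doteq \mathrm{SNR}^0$ in the exponential-equality sense of footnote 2.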

The modulo-$\Lambda_s$ "sphere-encoder" is easily implemented by some CLPS, using some "sphere decoding"
algorithm [22], [23]. The shaping effect of sphere-encoding is best illustrated using a 2-dimensional
example. Suppose that $\Lambda_c$ is the hexagonal lattice in two dimensions. Set Q = 16. The constellations
corresponding to the linear map (centred at the origin) and the sphere-encoder are shown in Fig. 1. As
the value of Q increases, the sphere-encoded constellation fills the fundamental Voronoi region of the
hexagonal lattice uniformly. Although both constellations correspond to signalling from the hexagonal
lattice, the energy saving of the sphere-encoder is evident.
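The two-dimensional example can be reproduced numerically. The sketch below builds both constellations for the hexagonal lattice with Q = 16 and compares their average energies. It is illustrative only: the basis, the brute-force quantizer (exact for this two-dimensional basis, not a general sphere encoder), the choice $u_0 = 0$ and all names are assumptions made for the example.

```python
import itertools
import numpy as np

G = np.array([[1.0, 0.5], [0.0, np.sqrt(3) / 2]])   # hexagonal coding lattice
Q = 16
G_s = Q * G                                          # shaping lattice Q * Lambda_c

def quantize(Gen, y):
    """Closest lattice point to y: coefficient rounding plus a +/-1 neighbourhood search."""
    u0 = np.rint(np.linalg.solve(Gen, y))
    candidates = [Gen @ (u0 + np.array(off))
                  for off in itertools.product((-1, 0, 1), repeat=2)]
    return min(candidates, key=lambda p: np.linalg.norm(y - p))

# One representative per coset of Lambda_s in Lambda_c: coefficients in {0, ..., Q-1}^2.
coeffs = np.array(list(itertools.product(range(Q), repeat=2)))
points = coeffs @ G.T

linear = points - points.mean(axis=0)                       # linear map, centred
sphere = np.array([p - quantize(G_s, p) for p in points])   # [lambda] mod Lambda_s

energy_linear = np.mean(np.sum(linear ** 2, axis=1))
energy_sphere = np.mean(np.sum(sphere ** 2, axis=1))
```

Both constellations contain the same $Q^2 = 256$ cosets, but the sphere-encoded points fill a hexagon rather than a slanted parallelogram, so `energy_sphere` comes out smaller than `energy_linear` (the latter equals 42.5 for this basis).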

Example 1: (The Golden-Gosset S-LaST code) When M = 2, we choose $G_p$ to be the lattice
generator matrix of the Golden code [11] and $G_\Lambda$ to be the generator matrix of the Gosset lattice $E_8$,
which are respectively given by

$$G_p = \frac{1}{\sqrt{5}} \begin{bmatrix} \mathrm{Re}(G_p^c) & -\mathrm{Im}(G_p^c) \\ \mathrm{Im}(G_p^c) & \mathrm{Re}(G_p^c) \end{bmatrix}$$
Example 2: (The Golden+ Algebra ($\mathcal{GA}+$) S-LaST code) Our second example is based on a 2 x 2
ST code derived from a maximal order of a CDA [12]. The Golden+ algebra [12] is defined to be
$\mathcal{GA}+ = (\mathbb{Q}(\delta)/\mathbb{Q}(i), \sigma, i)$, where $\delta$ is the first quadrant square root of $2 + i$ and the automorphism $\sigma$
is determined by $\sigma(\delta) = -\delta$. The maximal order $\mathcal{O}$ of $\mathcal{GA}+$ is generated by the following ordered
$\mathbb{Z}[i]$-basis:



1 




1 


1 


1 


5 


i 


'2 



-l-i5 
-1 + 5 



7 + 7(5 
-1 + 7(5 



7 + 7(5 1 — 8 

-l-\-i8 i — iS 
The Golden+ code [12] corresponds to the left ideal of the maximal order generated by 

(1 - 5) 3 

M 



(8) 



(9) 



(l + <5) 3 

In this case, we choose $G_\Lambda$ to be the lattice generator matrix corresponding to this left ideal of the
maximal order and $G_p = I$ (trivial rotation). Notice that this choice does not maximize the fundamental
coding gain (the Golden-Gosset S-LaST code has a higher density), but the minimum determinant of
the Golden+ S-LaST code is better than that of the Golden-Gosset code. It is a priori not clear which
effect will dominate the performance in terms of error probability; this will be answered in the simulation
results to follow.
C. Performance under low-complexity MMSE-GDFE Lattice Decoding

Unfortunately, because a non-linear encoder is used to achieve shaping gain, ML decoding of
the resulting code is very complicated, requiring essentially the exhaustive enumeration of the whole
codebook. Notice that a similar problem arises in the case of the $\mathcal{GA}+$ code in [12], where linear
encoding would result in very bad shaping. The authors in [12] have obtained shaping by enumerating
the minimum energy codewords and performing exhaustive decoding, both of which are feasible only for low
spectral efficiencies.

Hence, we resort to suboptimal MMSE-GDFE lattice decoding (see [1], [22] for details). It has been
proven that this decoder achieves the optimal DMT in the random coding sense, for a specific ensemble
of random lattices. Here, we use it with our deterministic, non-random constructions. We do not claim that
the resulting schemes achieve the optimal DMT under lattice decoding. Nevertheless, these codes perform
very well in simulation, as shown below. In our simulations, we make use of a random translation vector $u_0$, uniformly
distributed over a very large hypercube with volume much larger than the volume of the shaping region.
This random "dithering" is known to the receiver, and is subtracted before decoding, as explained in [1].
With this "trick", we ensure that the transmitted points have energy exactly equal to the second moment
of $\Lambda_s$ and have exactly zero mean. Furthermore, dithering symmetrizes the scheme and makes the error
probability independent of the transmitted codeword.
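The effect of dithering can be illustrated with a small Monte Carlo experiment. The sketch below is a simplification under stated assumptions: it uses a cubic shaping lattice $Q\mathbb{Z}^2$, for which the modulo operation reduces to coordinate-wise rounding, instead of the shaping lattices used in the paper, and it draws the dither uniformly over a hypercube whose side is a large multiple of Q. The names `mod_cubic` and `stats` are hypothetical.

```python
import numpy as np

def mod_cubic(y, Q):
    """[y] mod Q*Z^n, reducing each coordinate to [-Q/2, Q/2]."""
    return y - Q * np.rint(y / Q)

rng = np.random.default_rng(1)
Q, n, num_samples = 16, 2, 200_000

# Dither uniform over a hypercube much larger than the shaping region.
u = rng.uniform(0.0, 1000.0 * Q, size=(num_samples, n))

stats = {}
for lam in ([0.0, 0.0], [5.0, -3.0]):          # two different transmitted lattice points
    x = mod_cubic(np.array(lam) + u, Q)        # transmitted signal [lambda + u0] mod Lambda_s
    stats[tuple(lam)] = (x.mean(axis=0), np.mean(np.sum(x ** 2, axis=1)))

second_moment = n * Q ** 2 / 12                # energy of a uniform point in the shaping cell
```

For either lattice point the empirical mean is close to zero and the empirical energy is close to `second_moment`, which is the sense in which the transmitted energy and the error probability become independent of the codeword.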

Fig. 2 compares the performance of two 2 x 2 ST codes derived from CDA with R = 16 bpcu and
N = 2. The two ST codes chosen in this case have $\gamma_c(\Lambda_c)$ equal to 0.8365 and 1.4142 respectively.
Sphere encoding and MMSE-GDFE lattice decoding are used in both cases. We notice about one dB of
gain due to the better fundamental coding gain of the lattice.

Fig. 2. Effect of fundamental coding gain on performance: 2 x 2 ST codes derived from CDA, 16 bpcu, N = 2, MMSE-GDFE
lattice decoding

In order to illustrate the benefit of constellation shaping, we plot in Fig. 3 the performance of a 2 x 2
ST code derived from CDA, first using linear encoding of the information symbols and ML decoding and
then using sphere encoding and MMSE-GDFE decoding (R = 16 bpcu, N = 2). The particular ST code
chosen has $\gamma_c(\Lambda_c) = 0.8365$. A significant gain of about 3.5 dB results from codebook shaping in
this particular case.

Fig. 3. Effect of shaping gain on performance: 2 x 2 ST code derived from CDA, 16 bpcu, N = 2

For the case of M = 2, we compare the performance of the Golden code [11], which is a perfect 2 x 2
ST code (with $\gamma_c(\Lambda_c) = 1$), with the Golden-Gosset 2 x 2 S-LaST code from Example 1 ($\gamma_c(E_8) = 2$).
Fig. 4 shows plots of the Golden code under ML decoding and MMSE-GDFE lattice decoding in
comparison with the Golden-Gosset S-LaST code with MMSE-GDFE lattice decoding at rates of 4
and 16 bpcu. At 4 bpcu, the (real) information symbol constellation corresponds to BPSK signaling on
each dimension (Q = 2). In this case, the signal points of the Golden code in 8-dimensional space lie
on the surface of a sphere (they are vertices of the rotated hypercube). Therefore, the 2 x 2 perfect
code construction is optimal for 4 bpcu also in terms of shaping. This intuition is verified by the plots
corresponding to 4 bpcu in Fig. 4. However, when the number of bits per channel use increases, the effect
of the coding gain of the lattice and the shaping gain begin to show up. At 16 bpcu, the Golden-Gosset
S-LaST code with MMSE-GDFE lattice decoding (marginally) outperforms the Golden code with ML
decoding (see Fig. 4). These plots also serve to illustrate that MMSE-GDFE lattice decoding is near-ML
in performance, while offering significant reductions in complexity.

Fig. 4. Comparing the Golden Code with the Rotated Gosset Lattice ST Code, N = 2

In Fig. [5J we present comparisons of the Golden code with ML decoding, the Golden-Gosset S-LaST 
code (see Example [T]) and the S-A+ S-LaST code (see Example [2]), at 16 bpcu. While the fundamental 
coding gain of the lattice corresponding to the S-A+ code is less than the coding gain of Eg, the loss in 
density is compensated for by an increase in the minimum determinant. Both the Golden-Gosset and the 
S.A+ S-LaST codes with MMSE-GDFE lattice decoding outperform the Golden code with ML decoding. 

For the 3x3 case, we compare the performance of two perfect codes from [5] and [4] (with base 
alphabets QAM and HEX respectively) with an S-LaST code based on a rotated version of the Aig lattice, 
which is the best known lattice packing in 18-dimensions [6]. MMSE-GDFE lattice decoding is used for 
all cases. The results shown in Fig. [6] show a significant gain for both 6 and 24 bpcu resulting from the 
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Fig. 5. Performance of the 2x2 Golden code, Golden-Gosset and %A+ S-LaST codes at R = 16 bpcu. The inset shows a 
portion of the plot zoomed for clarity. 



increased lattice coding gain and shaping. 

In Fig. 7 we compare the performance of the 2x2 Golden-Gosset S-LaST code (T = 2) with rectangular 
2x4 and 2x6 S-LaST codes constructed using the horizontal-stacking construction [2] in conjunction 
with the Barnes-Wall lattice $\Lambda_{16}$ ($\gamma_c(\Lambda_{16}) = 2.8284$) and the Leech lattice $\Lambda_{24}$ ($\gamma_c(\Lambda_{24}) = 4$), respectively. The 
length-24 cyclic code $\mathcal{G}_{24}(\mathbb{Z}_4)$ constructed in [10] was used to construct an isomorphic version of the 
Leech lattice using Construction A [6]. MMSE-GDFE lattice decoding is used for all three ST codes. In 
accordance with intuition, the performance approaches the outage probability as T increases, owing to better 
values of $\gamma_c(\Lambda_c)$. 
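
The coding-gain values quoted above can be reproduced from the usual definition of the fundamental coding gain, $\gamma_c(\Lambda) = d_{\min}^2(\Lambda)/V(\Lambda)^{2/n}$. The following is only an illustrative sketch: the (minimum squared distance, fundamental volume, dimension) triples are the standard textbook normalizations of the Gosset, Barnes-Wall and Leech lattices, not values taken from this paper, and the function and dictionary names are hypothetical.

```python
def fundamental_coding_gain(d_min_sq, volume, n):
    """gamma_c = d_min^2 / V^(2/n) for an n-dimensional real lattice."""
    return d_min_sq / volume ** (2.0 / n)

# (min squared distance, fundamental volume, real dimension).
# Assumed standard scalings, not taken from the paper.
LATTICES = {
    "E8 (Gosset)": (2.0, 1.0, 8),
    "Lambda_16 (Barnes-Wall)": (4.0, 16.0, 16),
    "Lambda_24 (Leech)": (4.0, 1.0, 24),
}

gains = {name: fundamental_coding_gain(*params) for name, params in LATTICES.items()}
for name, gain in gains.items():
    print(f"{name}: gamma_c = {gain:.4f}")
```

Under these normalizations the Barnes-Wall and Leech values agree with the figures 2.8284 and 4 quoted in the text.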

IV. The S-LaST TCM Scheme 

Motivated by the fact that in practical wireless communications M is limited by transmitter complexity 
to be a small integer (typically 2 or 4 in the current IEEE 802.11n MIMO extension of wireless local area 
networks) while T may be of the order of 100 channel uses, our objective in this section is to construct 
M x T ST codes for the case of T > M. For ease of exposition and without loss of fundamental 
generality, we will focus on the case where T = LM, for some integer L. TCM has the nice feature that 
a single trellis code can generate any desired block length, with decoding complexity linear in L, using 
a Viterbi decoder. Furthermore, the construction of TCM schemes is rather well understood and a rich 
literature exists for the Gaussian channel (see [13], [14], [15] and references therein), the scalar fading 
channel (see [16] and references therein) and the MIMO fading channel [17], [18], [19]. 

Fig. 6. 3x3 ST codes under MMSE-GDFE lattice decoding, N = 3. 

Fig. 7. Increasing the coding length, M = N = 2, T = 2, 4, 6, R = 16 bpcu, MMSE-GDFE lattice decoding. 
[Plot versus SNR (dB). Curves: rotated Gosset lattice (T = 2), rotated BW-16 lattice (T = 4), rotated $\mathcal{G}_{24}(\mathbb{Z}_4)$ lattice (T = 6), outage (16 bpcu).] 

Submitted to IEEE Trans. Inform. Theory, Apr. 2008 (draft of April 10, 2008). 

Fig. 8. S-LaST TCM encoder. 
[Block diagram. The input bits are demultiplexed into "coded" and "uncoded" bits. The coded bits drive a convolutional encoder followed by a coset selector; the uncoded bits drive a point selector in $\Lambda_m$. The selected point, of dimension $2M^2$, passes through a mod-$\Lambda_b$ encoder and an (M x M) ST formatter that outputs the ST matrix.] 

A. Encoder 

Consider a three-level partition $\Lambda_t \supset \Lambda_m \supset \Lambda_b$ (where the subscripts indicate 'top', 'middle' and 
'bottom') of lattices in $\mathbb{R}^n$, with $n = 2M^2$. Let $[\Lambda_t|\Lambda_m] = \mathcal{M}$ and let the cosets of $\Lambda_m$ in $\Lambda_t$ be indicated 
by $\mathcal{C}_i = \{\mathbf{v}_i + \Lambda_m\}$, for $i = 1, \ldots, \mathcal{M}$, where each $\mathbf{v}_i$ is a coset representative of $\mathcal{C}_i$. From each coset $\mathcal{C}_i$, 
we carve a finite set of $\mathcal{N}$ points, denoted by $\{\mathbf{v}_i + \mathbf{c}_j : \mathbf{c}_j \in \Lambda_m, j = 1, \ldots, \mathcal{N}\}$. These points are chosen 
via a modulo-$\Lambda_b$ sphere encoder, which will be described in the following. Also, we choose $\Lambda_b$ such that 
$\mathcal{N} = [\Lambda_m|\Lambda_b]$. In all the examples presented here, we use $\Lambda_b = Q\Lambda_m$, for some $Q \in \mathbb{Z}_+$ (i.e., we again use 
a self-similar shaping lattice). In this case, $\mathcal{N} = Q^{2M^2}$. 

We make use of Forney's general "coset coding" framework [8]. A block diagram of the encoder 
is shown in Fig. 8. During each block $k = 1, \ldots, L$, comprising $M$ channel uses, a block of 
$(\log \mathcal{M})/r + \log \mathcal{N}$ information bits enters the encoder. The top $(\log \mathcal{M})/r$ information bits are input 
to a convolutional encoder of (binary) rate $r$, which outputs $\log \mathcal{M}$ coded bits that select the index 
$i_k \in \{1, \ldots, \mathcal{M}\}$ of a coset in $\Lambda_t/\Lambda_m$. The remaining $\log \mathcal{N}$ information bits select the index $j_k$ of a 
point in the finite constellation carved from the selected coset $\mathcal{C}_{i_k}$. 
The transmitted vector at time $k$ is given by 

\[ \mathbf{x}_k = [\mathbf{c}_{j_k} + \mathbf{v}_{i_k} + \mathbf{u}_k] \bmod \Lambda_b, \tag{10} \]

where $\mathbf{u}_k$ is an optional random dithering signal known to the receiver, which serves to symmetrize the 
overall TCM code and to induce the uniform error property. The vector $\mathbf{x}_k$ is then mapped into an $M \times M$ 
complex matrix and transmitted in $M$ channel uses across the MIMO channel. The rate of the S-LaST 
TCM scheme is given by 

\[ R = \frac{(\log \mathcal{M})/r + \log \mathcal{N}}{M} \quad \text{bits/channel use}. \]

It should be noticed that $\mathbf{x}_k = \mathbf{c}_{j_k} + \mathbf{v}_{i_k} + \mathbf{u}_k - \boldsymbol{\lambda}_k$ for some $\boldsymbol{\lambda}_k \in \Lambda_b$ that is a function of $\mathbf{c}_{j_k}$, $\mathbf{v}_{i_k}$, $\mathbf{u}_k$. 
Further, $\mathbf{x}_k \in \mathcal{V}(\Lambda_b)$. Since $[\Lambda_m|\Lambda_b] = \mathcal{N}$, the mapping between the uncoded bits and the constellation 
points in each coset is one-to-one. 
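
The modulo operation in (10) can be illustrated with a toy stand-in. The sketch below assumes, purely for simplicity, that $\Lambda_m = \mathbb{Z}^n$, $\Lambda_t = \frac{1}{2}\mathbb{Z}^n$ and $\Lambda_b = Q\mathbb{Z}^n$, so that reduction modulo $\Lambda_b$ is coordinate-wise; the lattices used in the paper would instead require a closest-point search in $\Lambda_b$. All function names are hypothetical.

```python
import numpy as np

def encode_block(c, v, u, Q):
    """Toy version of x_k = [c + v + u] mod Lambda_b with Lambda_b = Q * Z^n.

    Returns the transmitted vector x and the shift lam in Lambda_b such that
    x = c + v + u - lam, with x in the Voronoi region [-Q/2, Q/2)^n.
    """
    s = c + v + u
    lam = Q * np.floor(s / Q + 0.5)   # nearest point of Q * Z^n (coordinate-wise)
    return s - lam, lam

rng = np.random.default_rng(0)
M, Q = 2, 2
n = 2 * M * M                                    # real dimension n = 2 M^2
c = rng.integers(-5, 6, size=n).astype(float)    # point of Lambda_m = Z^n (toy)
v = 0.5 * rng.integers(0, 2, size=n)             # coset representative in (1/2) Z^n (toy)
u = rng.uniform(-Q / 2, Q / 2, size=n)           # dither, known to the receiver
x, lam = encode_block(c, v, u, Q)

points_per_coset = Q ** n                        # N = Q^(2 M^2)
print(x, lam / Q, points_per_coset)
```

In this toy setting with $Q = 2$ and $M = 2$, each coset carries $Q^{2M^2} = 256$ points, i.e. 8 uncoded bits per block.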

B. Decoder 

The (real equivalent) received point at each block $k$ is given by 

\[ \mathbf{y}_k = \mathbf{H}\mathbf{x}_k + \mathbf{w}_k, \]

for $k = 1, \ldots, L$. In general, the trellis of the S-LaST TCM scheme has $\mathcal{N}$ parallel transitions per trellis 
branch, corresponding to the $\mathcal{N}$ points in the intersection $\mathcal{C}_i \cap \mathcal{V}(\Lambda_b)$, on each branch labeled by the coset 
$\mathcal{C}_i$. Consider time $k$, and a branch labeled by coset $\mathcal{C}_i$. The corresponding branch metric for an ML trellis 
decoder (implemented via the Viterbi algorithm) is given by 

\[ B_{i,k} = \min_{\mathbf{c} \in \Lambda_m \cap \mathcal{V}(\Lambda_b)} |\mathbf{y}_k - \mathbf{H}(\mathbf{v}_i + \mathbf{c} + \mathbf{u}_k)|^2. \tag{11} \]

Computing this branch metric amounts to exhaustive enumeration of all points of $\Lambda_m$ in the Voronoi 
region $\mathcal{V}(\Lambda_b)$ of the shaping lattice. 

Since exhaustive enumeration is usually too complex, we resort once again to a suboptimal MMSE-GDFE 
lattice decoder along the lines of [1], in order to compute an approximate ML branch metric for 
the Viterbi decoder. First, we relax the minimization in (11) to take into account all points of $\Lambda_m$ (lattice 
decoding), i.e., we consider the suboptimal branch metric 

\[ B_{i,k} = \min_{\mathbf{c} \in \Lambda_m} |\mathbf{y}_k - \mathbf{H}(\mathbf{v}_i + \mathbf{c} + \mathbf{u}_k)|^2. \tag{12} \]

This amounts to solving a CLPS problem for the channel-modified lattice $\mathbf{H}\Lambda_m$, with respect to the point 
$\mathbf{y}_k - \mathbf{H}(\mathbf{v}_i + \mathbf{u}_k)$, where $\mathbf{u}_k$ is a known dithering vector and $\mathbf{v}_i$ depends on the label of the branch for 
which we compute the metric. The surviving path among the parallel paths corresponds to the argument 
$\mathbf{c}$ that minimizes (12). 

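Then, we further modify the suboptimal metric following the MMSE-GDFE paradigm (see [1] for the 
details). Let $\mathbf{F}$ and $\mathbf{B}$ denote the forward and backward filters of the MMSE-GDFE as defined in [1]. 
At each time $k$, the receiver obtains the following set of modified channel observations 

\[ \mathbf{y}'_{i,k} = \mathbf{F}\mathbf{y}_k - \mathbf{B}(\mathbf{v}_i + \mathbf{u}_k), \quad 1 \le i \le \mathcal{M}. \]

Using the properties of the matrices $\mathbf{F}$ and $\mathbf{B}$, these can be written as 

\begin{align*}
\mathbf{y}'_{i,k} &= \mathbf{F}[\mathbf{H}(\mathbf{c}_{j_k} + \mathbf{v}_{i_k} + \mathbf{u}_k - \boldsymbol{\lambda}_k) + \mathbf{w}_k] - \mathbf{B}[\mathbf{u}_k + \mathbf{v}_i] \\
&= \mathbf{B}(\mathbf{c}_{j_k} + \mathbf{v}_{i_k} - \boldsymbol{\lambda}_k - \mathbf{v}_i) - [\mathbf{B} - \mathbf{F}\mathbf{H}](\mathbf{c}_{j_k} + \mathbf{v}_{i_k} - \boldsymbol{\lambda}_k + \mathbf{u}_k) + \mathbf{F}\mathbf{w}_k \\
&= \mathbf{B}(\mathbf{c}_{j_k} + \mathbf{v}_{i_k} - \boldsymbol{\lambda}_k - \mathbf{v}_i) - [\mathbf{B} - \mathbf{F}\mathbf{H}]\mathbf{x}_k + \mathbf{F}\mathbf{w}_k \\
&= \mathbf{B}(\mathbf{c}_{j_k} + \mathbf{v}_{i_k} - \boldsymbol{\lambda}_k - \mathbf{v}_i) + \mathbf{e}'_k.
\end{align*}

Notice that $\mathbf{x}_k$ is uniformly distributed over $\mathcal{V}(\Lambda_b)$ and is hence independent of $\mathbf{c}_{j_k}$ and $\mathbf{v}_{i_k}$ [1]. It can 
be shown that the noise plus self-noise vector $\mathbf{e}'_k = -[\mathbf{B} - \mathbf{F}\mathbf{H}]\mathbf{x}_k + \mathbf{F}\mathbf{w}_k$ has the same covariance matrix as the original noise 
$\mathbf{w}_k$, although it is generally non-Gaussian. Also, $\mathbf{v}_{i_k} - \mathbf{v}_i = \mathbf{0}$ (i.e., it belongs to $\Lambda_m$) if $i_k = i$, while it 
belongs to some coset of $\Lambda_m$ in $\Lambda_t$ not equal to $\Lambda_m$ if $i_k \ne i$. 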
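
The covariance claim can be checked numerically. The sketch below assumes one common way of building the MMSE-GDFE filters, namely the QR decomposition of the augmented matrix $[\mathbf{H}; \mathbf{I}]$, and assumes that $\mathbf{x}_k$ and $\mathbf{w}_k$ are white with unit variance per real dimension (the SNR scaling is taken to be absorbed into $\mathbf{H}$). The exact normalization in [1] may differ, and the random channel is only a stand-in.

```python
import numpy as np

rng = np.random.default_rng(1)
n_rx, n_tx = 8, 8                        # real dimensions (toy sizes)
H = rng.standard_normal((n_rx, n_tx))    # real-equivalent channel (random stand-in)

# Reduced QR of the augmented matrix [H; I] = [Q1; Q2] R.
H_aug = np.vstack([H, np.eye(n_tx)])
Q_aug, R = np.linalg.qr(H_aug)
Q1 = Q_aug[:n_rx, :]

F = Q1.T      # forward filter (assumed construction)
B = R         # backward filter, upper triangular (assumed construction)

# Covariance of e' = -(B - F H) x + F w for white, unit-variance x and w.
E = B - F @ H
cov_e = E @ E.T + F @ F.T
print(np.round(cov_e, 6))
```

Under these assumptions `cov_e` equals the identity, which is the covariance of the original unit-variance noise, and $\mathbf{B}^T\mathbf{B} = \mathbf{H}^T\mathbf{H} + \mathbf{I}$.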

For each branch labeled by coset $\mathcal{C}_i$, the low-complexity Viterbi decoder computes the branch metric 

\[ B_{i,k} = \min_{\mathbf{z} \in \mathbb{Z}^{2M^2}} |\mathbf{y}'_{i,k} - \mathbf{B}\mathbf{G}_{\Lambda_m}\mathbf{z}|^2, \]

where $\mathbf{G}_{\Lambda_m}$ denotes a generator matrix for $\Lambda_m$. This can be obtained by a sphere decoder applied to the 
channel-modified lattice $\mathbf{B}\Lambda_m$. It is clear that the branch metric for the correct coset (i.e., for $i = i_k$) 
will be smaller than the branch metric for an incorrect coset, with high probability. 
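
A minimal sketch of this branch-metric computation follows. It replaces the sphere decoder with a brute-force search over a small integer box, which is only feasible in very low dimension, and it uses a toy partition $\Lambda_t = \mathbb{Z}^2$, $\Lambda_m = 2\mathbb{Z}^2$ with a hand-picked upper-triangular matrix standing in for $\mathbf{B}$. None of these choices come from the paper.

```python
import itertools
import numpy as np

def branch_metric(y_prime, B, G, radius=3):
    """min over z in {-radius..radius}^n of |y' - B G z|^2.

    Brute-force stand-in for a sphere decoder on the lattice B * Lambda_m.
    """
    BG = B @ G
    best = np.inf
    for z in itertools.product(range(-radius, radius + 1), repeat=G.shape[1]):
        d = y_prime - BG @ np.array(z, dtype=float)
        best = min(best, float(d @ d))
    return best

# Toy partition: Lambda_t = Z^2, Lambda_m = 2 Z^2, hence 4 cosets.
G = 2.0 * np.eye(2)
reps = [np.array(v, dtype=float) for v in itertools.product((0, 1), repeat=2)]

B = np.array([[2.0, 0.7], [0.0, 1.5]])       # toy backward filter
e = np.array([0.05, -0.03])                  # small effective noise e'_k
i_true = 3                                   # index of the transmitted coset
c_minus_lam = G @ np.array([1.0, -1.0])      # c - lambda, an element of Lambda_m

metrics = []
for v_i in reps:
    # y'_{i,k} = B (c - lambda + v_{i_k} - v_i) + e'_k
    y_prime = B @ (c_minus_lam + reps[i_true] - v_i) + e
    metrics.append(branch_metric(y_prime, B, G))

best_coset = int(np.argmin(metrics))
print(metrics, best_coset)
```

For the correct coset the metric reduces to the energy of the effective noise, while for the other cosets the residual contains a non-zero coset offset filtered by the toy $\mathbf{B}$.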

C. Construction of suitable lattice partition chains 

In order to ensure good performance, we choose the component $M \times M$ code of the S-LaST TCM 
scheme to be approximately universal. We will therefore choose $\Lambda_t$ to be the lattice corresponding to 
an ST code derived from a CDA with NVD. In order to construct $\Lambda_m$ and $\Lambda_b$, we will first discuss the 
important special case when $\Lambda_t$ corresponds to a perfect code, and then treat the more general case. 

1) Partitions of perfect codes: Let $\Lambda_t$ be the lattice corresponding to a perfect code [4], [5], with 
generator matrix $\mathbf{G}_p$. Then, $\Lambda_t$ is a rotated version of the cubic lattice $\mathbb{Z}^{2M^2}$. Following what was done 
before for the case of short block codes, we choose $\Lambda_m$ to be the best known integral lattice packing 
in $2M^2$-dimensional space, rotated by $\mathbf{G}_p$. Also, we set $\Lambda_b = Q\Lambda_m$. For example, when $M = 2$, we 
choose $\Lambda_m$ to be the Golden-Gosset lattice. The resulting code shall be named the Golden-Gosset S-LaST 
TCM scheme. 

2) S-LaST TCM from maximal orders in CDAs: We choose $\Lambda_t$ to be the lattice corresponding to the 
maximal order of a given CDA. An example for the case when $M = 2$ would be the lattice corresponding 
to the $\mathcal{GA}+$ code that we made use of for the short block-length case in Example 2. Similar to the approach 
used in [20], [21] for the cubic lattice case, we will use ideals $\beta\mathcal{O}$ of the maximal order for the sublattice 
$\Lambda_m$. The element $\beta$ yielding a good sublattice is obtained through a computer search that makes use of 
the following lemma. 

Lemma 2: Let $\mathcal{D}(L/\mathbb{Q}(i), \sigma, \gamma)$ be a cyclic division algebra of index $n$, and let $\mathcal{O}$ denote an order of 
$\mathcal{D}$. If $\beta$ is an element of the order, then 

\[ [\mathcal{O}\,|\,\beta\mathcal{O}] = |N_r(\beta)^n|^2. \]

Proof: Although this lemma is well known to the mathematics community, we provide a sketch of 
the proof for completeness. Consider any $\beta \in \mathcal{O}$. Then $\beta$ induces a transformation on $\mathcal{O}$ with image $\beta\mathcal{O}$. 
These are finitely generated free modules over $\mathbb{Z}$, and so the index of partition is just the determinant of 
$\beta$ in this action. 

We may compute the determinant over the corresponding field. $\mathcal{D}$ has rank $2n^2$ over $\mathbb{Q}$. First viewing 
$\mathcal{D}$ as a (right) vector space of dimension $n^2$ over $\mathbb{Q}(i)$, we see that the determinant of multiplication by 
$\beta$ is $N_r(\beta)^n$. We then apply the norm from $\mathbb{Q}(i)$ to $\mathbb{Q}$ to obtain the determinant. ■ 
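
As a sanity check of the "index equals determinant" argument, the sketch below treats only the degenerate case $n = 1$, where the algebra is $\mathbb{Q}(i)$ itself, the order is $\mathbb{Z}[i]$ and $N_r(\beta) = \beta$, so that the lemma predicts $[\mathbb{Z}[i]\,|\,\beta\mathbb{Z}[i]] = |\beta|^2$. This is not one of the algebras used in the paper, and the function names are hypothetical.

```python
def mult_matrix(a, b):
    """Matrix of x -> (a + b i) x on the Z-basis {1, i} of Z[i]."""
    return [[a, -b], [b, a]]

def det2(m):
    """Determinant of a 2x2 integer matrix."""
    return m[0][0] * m[1][1] - m[0][1] * m[1][0]

def index_by_counting(a, b):
    """Count Gaussian integers in the half-open fundamental parallelogram
    spanned by beta and i*beta, i.e. the number of cosets of beta*Z[i] in Z[i]."""
    d = a * a + b * b
    bound = abs(a) + abs(b)
    count = 0
    for p1 in range(-bound, bound + 1):
        for p2 in range(-bound, bound + 1):
            # Coordinates of p in the basis {beta, i*beta}, scaled by d (exact integers).
            t1 = a * p1 + b * p2
            t2 = -b * p1 + a * p2
            if 0 <= t1 < d and 0 <= t2 < d:
                count += 1
    return count

for a, b in [(2, 1), (1, 1), (3, -2), (0, 2)]:
    print((a, b), det2(mult_matrix(a, b)), index_by_counting(a, b))
```

For index $n = 2$, the same lemma implies that a 16-ary partition requires $|N_r(\beta)|^4 = 16$, i.e. a reduced norm of squared modulus 4.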

The computer search performs the following: 

1) Fix a desired index of partition $\mathcal{M} = [\Lambda_t|\Lambda_m]$, and a sufficiently large integer $\nu$. 

2) Let $\mathcal{O}_\nu$ denote the integral closure of $\{-\nu, -\nu + 1, \ldots, \nu - 1, \nu\} \subset \mathbb{Z}$ in $\mathcal{O}$. More specifically, if 
$\gamma_1, \gamma_2, \ldots, \gamma_{2M^2}$ constitutes a basis for $\mathcal{O}$ over $\mathbb{Z}$, then 

\[ \mathcal{O}_\nu = \left\{ \sum_{i=1}^{2M^2} g_i \gamma_i : -\nu \le g_i \le \nu, \; g_i \in \mathbb{Z} \;\; \forall i \right\}. \]

Notice that such a basis always exists, since every algebraic number field has at least one integral 
basis [25]. 

3) For each $\beta \in \mathcal{O}_\nu$ that generates a partition with the required index $\mathcal{M}$, i.e., satisfying $|N_r(\beta)^M|^2 = \mathcal{M}$, 
compute the fundamental coding gain of the lattice corresponding to $\beta\mathcal{O}$, and let $\beta_{\max}$ denote a 
maximizer. 

4) Set $\Lambda_m$ to be the lattice corresponding to $\beta_{\max}\mathcal{O}$. 

Finally, as before, we use the self-similar shaping lattice $\Lambda_b = Q\Lambda_m$, for some $Q \in \mathbb{Z}_+$. 

D. Code construction examples 

In this section, we present two construction examples of S-LaST TCM, the performances of which are 
compared by simulation. 

• The Golden-Gosset S-LaST TCM construction (see Example 1): here $\Lambda_t = \mathbf{G}_p\mathbb{Z}^8$, $\Lambda_m = \mathbf{G}_p E_8$ 
and $\Lambda_b = Q\Lambda_m$, $Q \in \mathbb{Z}_+$. 

• The $\mathcal{GA}+$ S-LaST TCM construction: we choose $\Lambda_t$ to be the lattice corresponding to the $\mathcal{GA}+$ 
S-LaST code in Example 2. $\Lambda_m$ is obtained using the computer search given above, and corresponds 
to the left ideal of $\mathcal{O}$ generated by $\beta$, where $\mathcal{O}$ is the maximal order of the 
$\mathcal{GA}+$ algebra (see Example 2) and the coordinates of $\beta$ in terms of the ordered basis in (8) are 
$(-1, -1, 1 - i, -1 - i)$. We then set $\Lambda_b = Q\Lambda_m$, $Q \in \mathbb{Z}_+$. 

Both these codes correspond to a 16-ary partition $\Lambda_t/\Lambda_m$, as shown in Fig. 9. The minimum determinant 
increases as one goes down the partition chain. We use the trellis shown in Fig. 10, which is designed such 
that the transitions leaving/merging into a state have the maximum possible minimum determinant. 

Fig. 9. Two-level partition of the example constructions. 
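
The 16-ary index of the Golden-Gosset partition can be checked on an unrotated model. The sketch below assumes that $E_8$ is realized inside $\mathbb{Z}^8$ by Construction A applied to the [8,4,4] extended Hamming code, which is a standard realization (a copy of $E_8$ scaled by $\sqrt{2}$) but not necessarily the one used by the authors. The rotation by $\mathbf{G}_p$ is not modeled, since a rotation changes neither the index nor the Euclidean distances.

```python
import itertools
import numpy as np

# Generator matrix of the [8,4,4] extended Hamming code (Reed-Muller code RM(1,3)).
G_CODE = np.array([
    [1, 1, 1, 1, 1, 1, 1, 1],
    [0, 0, 0, 0, 1, 1, 1, 1],
    [0, 0, 1, 1, 0, 0, 1, 1],
    [0, 1, 0, 1, 0, 1, 0, 1],
])

codewords = {
    tuple(((np.array(m) @ G_CODE) % 2).tolist())
    for m in itertools.product((0, 1), repeat=4)
}

def coset_label(x):
    """Canonical label of the coset x + C in F_2^8 (lexicographically smallest member)."""
    return min(tuple((xi + ci) % 2 for xi, ci in zip(x, c)) for c in codewords)

# Cosets of Lambda = C + 2 Z^8 in Z^8 correspond to cosets of C in F_2^8.
cosets = {coset_label(x) for x in itertools.product((0, 1), repeat=8)}
index = len(cosets)

min_weight = min(sum(c) for c in codewords if any(c))
d_min_sq = min(4, min_weight)            # 2*e_i has squared norm 4; lifted codewords have norm >= weight
gamma_c = d_min_sq / index ** (2 / 8)    # the volume of Lambda equals its index in Z^8
print(index, min_weight, gamma_c)
```

Under this assumption the partition has 16 cosets and the sublattice has fundamental coding gain 2, the value for $E_8$.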

In our simulations, we have used block length T = 260 channel uses, corresponding to 1300 information 
bits per packet, at R = 5 bpcu. Fig. 11 shows the performance in terms of packet error probability of the 
above two S-LaST TCM schemes in comparison with the Golden ST TCM (GST-TCM) scheme [21] at 
5 bpcu. Also shown is the performance of the "uncoded Golden code" construction [21], which consists 
of stacking 130 Golden code matrices next to each other (coding is performed only over 2 time-slots). 
The proposed S-LaST TCM construction is seen to gain around 1 dB over the GST-TCM scheme. 

Fig. 10. 16-state trellis used for the example constructions. 
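
The simulation parameters can be cross-checked against the rate expression $R = ((\log \mathcal{M})/r + \log \mathcal{N})/M$. In the sketch below, the values $Q = 2$ and $r = 2$ are assumptions that are not stated in the text; they are chosen because they reproduce R = 5 bpcu. Here $r$ is read as the number of coded bits per information bit, so that $(\log \mathcal{M})/r$ is the number of information bits entering the convolutional encoder (a conventional rate-1/2 encoder in this reading). The function name is hypothetical.

```python
import math

def slast_tcm_rate(M, index_top, r, Q):
    """Return (rate in bpcu, information bits per block) for
    R = ((log2 index_top)/r + log2 N_pts) / M with N_pts = Q**(2*M*M)."""
    n_pts = Q ** (2 * M * M)
    bits_per_block = math.log2(index_top) / r + math.log2(n_pts)
    return bits_per_block / M, bits_per_block

M, T = 2, 260
# index_top = 16 is the 16-ary partition; r = 2 and Q = 2 are assumed values.
rate, bits_per_block = slast_tcm_rate(M=M, index_top=16, r=2, Q=2)
num_blocks = T // M                       # L = T / M
total_bits = num_blocks * bits_per_block
print(rate, num_blocks, total_bits)
```

With these assumed values, the 130 blocks match the 130 stacked Golden code matrices mentioned above, and the total matches the 1300 information bits per packet.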

V. Conclusions 

In this paper, we have advocated the use of structured lattices that are endowed with good packing 
and shaping properties in the design of space-time codes with both short and long block-lengths. The 
constructions presented have reasonable decoding complexity, and exhibit excellent performance in terms 
of error probability. 

Quite a few research topics occur naturally as potential follow-up works. While codes with short block- 
length have performances that are very close to the outage probability, there is still quite a significant 
gap from outage for the case of long block-lengths. Designing better codes for this scenario remains a 
challenging open problem. It would also be interesting to explore if there exist better algebraic frameworks 
that allow us to choose sublattices with good packing and shaping properties. 